[RISCV] Allow large div peephole optimization for minsize #163679
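@llvm/pr-subscribers-backend-risc-v
Author: None (kvpanchenko)
Changes
When the `minsize` function attribute is set, division of large integers by a power of 2 is not optimized as the ExpandLargeDivRem pass expects, which results in a compiler crash.
Full diff: https://github.com/llvm/llvm-project/pull/163679.diff
2 Files Affected: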
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7123a2d706787..a82020a846949 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -24828,7 +24828,7 @@ bool RISCVTargetLowering::isIntDivCheap(EVT VT, AttributeList Attr) const {
// instruction, as it is usually smaller than the alternative sequence.
// TODO: Add vector division?
bool OptSize = Attr.hasFnAttr(Attribute::MinSize);
- return OptSize && !VT.isVector();
+ return OptSize && !VT.isVector() && VT.getSizeInBits() <= 128;
}
bool RISCVTargetLowering::preferScalarizeSplat(SDNode *N) const {
diff --git a/llvm/test/CodeGen/RISCV/div_minsize.ll b/llvm/test/CodeGen/RISCV/div_minsize.ll
index 601821b932a52..69acc59f3859f 100644
--- a/llvm/test/CodeGen/RISCV/div_minsize.ll
+++ b/llvm/test/CodeGen/RISCV/div_minsize.ll
@@ -68,3 +68,155 @@ define i32 @testsize4(i32 %x) minsize nounwind {
%div = udiv i32 %x, 33
ret i32 %div
}
+
+define i128 @i128_sdiv(i128 %arg0) minsize nounwind {
+; RV32IM-LABEL: i128_sdiv:
+; RV32IM: # %bb.0:
+; RV32IM-NEXT: addi sp, sp, -64
+; RV32IM-NEXT: sw ra, 60(sp) # 4-byte Folded Spill
+; RV32IM-NEXT: sw s0, 56(sp) # 4-byte Folded Spill
+; RV32IM-NEXT: lw a3, 0(a1)
+; RV32IM-NEXT: lw a4, 4(a1)
+; RV32IM-NEXT: lw a5, 8(a1)
+; RV32IM-NEXT: lw a6, 12(a1)
+; RV32IM-NEXT: mv s0, a0
+; RV32IM-NEXT: li a7, 4
+; RV32IM-NEXT: addi a0, sp, 40
+; RV32IM-NEXT: addi a1, sp, 24
+; RV32IM-NEXT: addi a2, sp, 8
+; RV32IM-NEXT: sw a7, 8(sp)
+; RV32IM-NEXT: sw zero, 12(sp)
+; RV32IM-NEXT: sw zero, 16(sp)
+; RV32IM-NEXT: sw zero, 20(sp)
+; RV32IM-NEXT: sw a3, 24(sp)
+; RV32IM-NEXT: sw a4, 28(sp)
+; RV32IM-NEXT: sw a5, 32(sp)
+; RV32IM-NEXT: sw a6, 36(sp)
+; RV32IM-NEXT: call __divti3
+; RV32IM-NEXT: lw a0, 40(sp)
+; RV32IM-NEXT: lw a1, 44(sp)
+; RV32IM-NEXT: lw a2, 48(sp)
+; RV32IM-NEXT: lw a3, 52(sp)
+; RV32IM-NEXT: sw a0, 0(s0)
+; RV32IM-NEXT: sw a1, 4(s0)
+; RV32IM-NEXT: sw a2, 8(s0)
+; RV32IM-NEXT: sw a3, 12(s0)
+; RV32IM-NEXT: lw ra, 60(sp) # 4-byte Folded Reload
+; RV32IM-NEXT: lw s0, 56(sp) # 4-byte Folded Reload
+; RV32IM-NEXT: addi sp, sp, 64
+; RV32IM-NEXT: ret
+;
+; RV64IM-LABEL: i128_sdiv:
+; RV64IM: # %bb.0:
+; RV64IM-NEXT: addi sp, sp, -16
+; RV64IM-NEXT: sd ra, 8(sp) # 8-byte Folded Spill
+; RV64IM-NEXT: li a2, 4
+; RV64IM-NEXT: li a3, 0
+; RV64IM-NEXT: call __divti3
+; RV64IM-NEXT: ld ra, 8(sp) # 8-byte Folded Reload
+; RV64IM-NEXT: addi sp, sp, 16
+; RV64IM-NEXT: ret
+ %div = sdiv i128 %arg0, 4
+ ret i128 %div
+}
+
+define i256 @i256_sdiv(i256 %arg0) minsize nounwind {
+; RV32IM-LABEL: i256_sdiv:
+; RV32IM: # %bb.0:
+; RV32IM-NEXT: lw a5, 16(a1)
+; RV32IM-NEXT: lw a4, 20(a1)
+; RV32IM-NEXT: lw a2, 24(a1)
+; RV32IM-NEXT: lw a3, 28(a1)
+; RV32IM-NEXT: lw a6, 0(a1)
+; RV32IM-NEXT: lw a7, 4(a1)
+; RV32IM-NEXT: lw t0, 8(a1)
+; RV32IM-NEXT: lw t1, 12(a1)
+; RV32IM-NEXT: srai a1, a3, 31
+; RV32IM-NEXT: srli a1, a1, 30
+; RV32IM-NEXT: add a1, a6, a1
+; RV32IM-NEXT: sltu t2, a1, a6
+; RV32IM-NEXT: add a6, a7, t2
+; RV32IM-NEXT: sltu a7, a6, a7
+; RV32IM-NEXT: and t2, t2, a7
+; RV32IM-NEXT: add a7, t0, t2
+; RV32IM-NEXT: sltu t3, a7, t0
+; RV32IM-NEXT: add t0, t1, t3
+; RV32IM-NEXT: beqz t2, .LBB5_2
+; RV32IM-NEXT: # %bb.1:
+; RV32IM-NEXT: sltu t1, t0, t1
+; RV32IM-NEXT: and t2, t3, t1
+; RV32IM-NEXT: .LBB5_2:
+; RV32IM-NEXT: add t2, a5, t2
+; RV32IM-NEXT: srli t1, t0, 2
+; RV32IM-NEXT: srli t3, a7, 2
+; RV32IM-NEXT: slli t0, t0, 30
+; RV32IM-NEXT: slli a7, a7, 30
+; RV32IM-NEXT: or t0, t3, t0
+; RV32IM-NEXT: srli t3, a6, 2
+; RV32IM-NEXT: srli a1, a1, 2
+; RV32IM-NEXT: slli a6, a6, 30
+; RV32IM-NEXT: sltu a5, t2, a5
+; RV32IM-NEXT: or a7, t3, a7
+; RV32IM-NEXT: srli t3, t2, 2
+; RV32IM-NEXT: slli t2, t2, 30
+; RV32IM-NEXT: or a1, a1, a6
+; RV32IM-NEXT: add a6, a4, a5
+; RV32IM-NEXT: or t1, t1, t2
+; RV32IM-NEXT: sltu a4, a6, a4
+; RV32IM-NEXT: srli t2, a6, 2
+; RV32IM-NEXT: slli a6, a6, 30
+; RV32IM-NEXT: sw a1, 0(a0)
+; RV32IM-NEXT: sw a7, 4(a0)
+; RV32IM-NEXT: sw t0, 8(a0)
+; RV32IM-NEXT: sw t1, 12(a0)
+; RV32IM-NEXT: and a4, a5, a4
+; RV32IM-NEXT: or a1, t3, a6
+; RV32IM-NEXT: add a4, a2, a4
+; RV32IM-NEXT: srli a5, a4, 2
+; RV32IM-NEXT: sltu a2, a4, a2
+; RV32IM-NEXT: slli a4, a4, 30
+; RV32IM-NEXT: add a2, a3, a2
+; RV32IM-NEXT: or a3, t2, a4
+; RV32IM-NEXT: slli a4, a2, 30
+; RV32IM-NEXT: srai a2, a2, 2
+; RV32IM-NEXT: or a4, a5, a4
+; RV32IM-NEXT: sw a1, 16(a0)
+; RV32IM-NEXT: sw a3, 20(a0)
+; RV32IM-NEXT: sw a4, 24(a0)
+; RV32IM-NEXT: sw a2, 28(a0)
+; RV32IM-NEXT: ret
+;
+; RV64IM-LABEL: i256_sdiv:
+; RV64IM: # %bb.0:
+; RV64IM-NEXT: ld a2, 24(a1)
+; RV64IM-NEXT: ld a3, 16(a1)
+; RV64IM-NEXT: ld a4, 0(a1)
+; RV64IM-NEXT: ld a1, 8(a1)
+; RV64IM-NEXT: srai a5, a2, 63
+; RV64IM-NEXT: srli a5, a5, 62
+; RV64IM-NEXT: add a5, a4, a5
+; RV64IM-NEXT: sltu a4, a5, a4
+; RV64IM-NEXT: srli a5, a5, 2
+; RV64IM-NEXT: add a6, a1, a4
+; RV64IM-NEXT: sltu a1, a6, a1
+; RV64IM-NEXT: and a1, a4, a1
+; RV64IM-NEXT: srli a4, a6, 2
+; RV64IM-NEXT: slli a6, a6, 62
+; RV64IM-NEXT: or a5, a5, a6
+; RV64IM-NEXT: add a1, a3, a1
+; RV64IM-NEXT: srli a6, a1, 2
+; RV64IM-NEXT: sltu a3, a1, a3
+; RV64IM-NEXT: slli a1, a1, 62
+; RV64IM-NEXT: add a2, a2, a3
+; RV64IM-NEXT: or a1, a4, a1
+; RV64IM-NEXT: slli a3, a2, 62
+; RV64IM-NEXT: srai a2, a2, 2
+; RV64IM-NEXT: or a3, a6, a3
+; RV64IM-NEXT: sd a5, 0(a0)
+; RV64IM-NEXT: sd a1, 8(a0)
+; RV64IM-NEXT: sd a3, 16(a0)
+; RV64IM-NEXT: sd a2, 24(a0)
+; RV64IM-NEXT: ret
+ %div = sdiv i256 %arg0, 4
+ ret i256 %div
+}
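For readers following the i256_sdiv CHECK lines above: with the size cap in place, the wide division by 4 is lowered to a bias-and-shift sequence (srai, srli, add with carry propagation, then a final srai by 2). A minimal single-word sketch of the same idea in C++, assuming a 64-bit operand and C++20 semantics for shifting and converting negative values. `sdiv_pow2` is a hypothetical name used for illustration, not an LLVM function:

```cpp
#include <cstdint>

// Signed division by 2^k, rounding toward zero, without a divide instruction.
// Assumes 1 <= k <= 62 and C++20 (arithmetic right shift of negative values,
// modular unsigned-to-signed conversion).
int64_t sdiv_pow2(int64_t x, unsigned k) {
  // All ones if x is negative, zero otherwise (the "srai ..., 63" step).
  uint64_t sign = static_cast<uint64_t>(x >> 63);
  // Bias of 2^k - 1 for negative x, 0 otherwise (the "srli ..., 64 - k" step).
  uint64_t bias = sign >> (64 - k);
  // Add the bias so the arithmetic shift rounds toward zero, not toward -inf.
  uint64_t sum = static_cast<uint64_t>(x) + bias;
  return static_cast<int64_t>(sum) >> k;
}
```

For example, `sdiv_pow2(-7, 2)` agrees with `-7 / 4`. The multi-word sequences in the test do the same thing, with `sltu` used to carry the bias addition across words.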
@topperc general question: is it possible to backport this to old releases, such as 18.x?
As far as I know, we only do backports to the most recent release branch.
10e35be to fa024a6
@topperc if that looks good now, can you please merge it?
LGTM
Can you provide an email address per LLVM Developer Policy? https://llvm.org/docs/DeveloperPolicy.html#email-addresses
fa024a6 to eee262a
Oops. Added.
I'm still just seeing a noreply GitHub email address.
When the `minsize` function attribute is set, division of large integers by a power of 2 is not optimized as the ExpandLargeDivRem pass expects, which results in a compiler crash.
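The effect of the one-line change can be modeled outside LLVM as a plain predicate. This is only a sketch: `isIntDivCheapModel` and its parameters are hypothetical stand-ins for the `Attr` and `VT` queries in the diff, and the 128-bit bound is copied from the patch rather than derived here.

```cpp
// Hypothetical standalone model of the patched predicate; not LLVM code.
// hasMinSize  : the function carries the minsize attribute
// isVector    : the value type is a vector type
// sizeInBits  : total width of the value type in bits
bool isIntDivCheapModel(bool hasMinSize, bool isVector, unsigned sizeInBits) {
  // Before the patch the width check was absent, so any scalar width
  // was reported as "cheap" under minsize.
  return hasMinSize && !isVector && sizeInBits <= 128;
}
```

Under this model the i128 test keeps its division (the `__divti3` call in the CHECK lines), while i256 is no longer reported as cheap, so the division by a power of 2 is rewritten into shifts.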
eee262a to 8a34566
Hm... how about now?
I see it in your commit now, but the GitHub squash button still wants to use the noreply email as the author. I can put your email in a Co-authored-by comment.
That works for me. Thanks.
@kvpanch Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again. If you don't get any reports, no action is required from you. Your changes are working as expected, well done!